DOCUMENT RESUME

ED 123 250                                        TM 005 309

AUTHOR       Martin, Gerald R.
TITLE        The Estimation of Theta in the Integrated Moving
             Average Time-Series Model.
PUB DATE     [Apr 76]
NOTE         20p.; Paper presented at the Annual Meeting of the
             American Educational Research Association (60th, San
             Francisco, California, April 19-23, 1976)

EDRS PRICE   MF-$0.83 HC-$1.67 Plus Postage.
DESCRIPTORS  *Comparative Analysis; Computer Programs; *Data
             Analysis; *Mathematical Models; Probability;
             Simulation; Statistical Analysis; *Time
IDENTIFIERS  *Integrated Moving Average Models; Monte Carlo
             Methods; Time Series Data



ABSTRACT

        Through Monte Carlo procedures, three different
techniques for estimating the parameter theta (proportion of the
"shocks" remaining in the system) in the Integrated Moving Average
(IMA) time series model are compared in terms of (1) the accuracy
of the estimates, (2) the independence of the estimates from the true
value of theta, and (3) the independence of the estimates from a
"shift in level" in the time series following an intervention. In the
"usual" range for theta, the methods appear equally accurate. One
produces complex estimates in special cases. Estimates are
independent of the true value and changes in level. (Author)



***********************************************************************
* Documents acquired by ERIC include many informal unpublished        *
* materials not available from other sources. ERIC makes every effort *
* to obtain the best copy available. Nevertheless, items of marginal  *
* reproducibility are often encountered and this affects the quality  *
* of the microfiche and hardcopy reproductions ERIC makes available   *
* via the ERIC Document Reproduction Service (EDRS). EDRS is not      *
* responsible for the quality of the original document. Reproductions *
* supplied by EDRS are the best that can be made from the original.   *
***********************************************************************




The Estimation of Theta

in the

Integrated Moving Average Time-Series Model



Gerald R. Martin
Vernon L. Hendrix 
Victor L. Willson 

NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED
EXACTLY AS RECEIVED FROM THE PERSON
OR ORGANIZATION ORIGINATING IT.
POINTS OF VIEW OR OPINIONS STATED DO
NOT NECESSARILY REPRESENT OFFICIAL
NATIONAL INSTITUTE OF EDUCATION
POSITION OR POLICY.



University of Minnesota 



Session 12.14

American Educational Research Association

San Francisco, 1976



1. Background

The Integrated Moving Average (IMA) models for analysis of time
series data have been increasingly useful in the behavioral sciences,
including educational research. Specifically, these models are
well-suited for testing hypotheses arising from interventions in either
experimental or non-experimental situations; the researcher can
compare a variable's pattern of behavior before the intervention
has occurred with its behavior afterwards, and can do so without having
to meet common assumptions of stochastic independence of observations
(see Glass, Willson, and Gottman, 1975, for methods and examples).

Of these models, the model IMA (0,1,1) is frequently identified
as a good descriptor of sample time series data. This model has the
form

(1.1)    $z_t - z_{t-1} = a_t - \theta a_{t-1}$

where $z_t$ = observation or datum recorded at time period $t$, $a_t$ =
random "shock" at time $t$, and $\theta$ (theta) = a fixed constant. It
postulates (in words) that the difference between two consecutive
observations is due to a random shock at the time of the current
observation, minus (or plus, depending on the sign of $\theta$) some fixed
proportion ($\theta$) of the shock "left over" from the preceding
observation.

The single parameter $\theta$ measures "carryover" of the influence of
the random shocks; for reasons of mathematical stability, $\theta$ must be
in the interval (-1,+1), and so may indeed be thought of as a proportion.

IMA (0,1,1) can be rearranged in various ways to incorporate
parameters measuring patterns in the data, or changes in patterns
coincident with interventions; such parameters may be used to measure
series level, change in level after intervention, series drift, or
change in series drift after intervention.

For example, appropriate rearrangement of (1.1) yields

(1.2)    $z_t = L + (1 - \theta) \sum_{i=1}^{t-1} a_i + a_t$

which expresses $z_t$ as a sum (hence, integrated moving average) of
previous and current random shocks; the parameter $L$ has been added to
indicate the "level" of the series previous to observation 1. A
value of $L$ may be estimated from the data, given a suitable value of
$\theta$; more typically, however, it is a change in series level that is
of interest. By postulating (1.2) before a treatment event (or
intervention) E occurs, and by postulating

(1.3)    $z_t = L + \delta + (1 - \theta) \sum_{i=1}^{t-1} a_i + a_t$

after E, one may estimate not only $L$, but $\delta$ (change in series
level at E) as well. Once again, this estimation requires a suitably
accurate value of $\theta$.
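The generating process in (1.2) and (1.3) can be sketched in a few lines
of Python. This is only an illustration under stated assumptions, not the
program used in the study: the function name `simulate_ima`, its
arguments, and the use of `random.gauss` for the shocks are all choices
made here for the example.

```python
import random


def simulate_ima(theta, n1=30, n=60, level=0.0, delta=0.0, seed=None):
    """Generate z_1..z_n from (1.2) before the intervention and (1.3) after.

    Hypothetical helper: n1 is the number of pre-intervention points,
    `level` plays the role of L, and the shocks are independent N(0, 1).
    Returns the series and the shocks that produced it.
    """
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = []
    cum = 0.0  # running sum a_1 + ... + a_{t-1}
    for t in range(1, n + 1):
        a_t = shocks[t - 1]
        shift = delta if t > n1 else 0.0  # delta enters only after E
        z.append(level + shift + (1.0 - theta) * cum + a_t)
        cum += a_t
    return z, shocks


if __name__ == "__main__":
    series, _ = simulate_ima(theta=0.4, delta=0.5, seed=1)
    print(len(series), series[:3])
```

Differencing the output recovers (1.1): consecutive values differ by
$a_t - \theta a_{t-1}$, plus $\delta$ at the first post-intervention point.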

Other models may be derived, and parameters defined as needed. A
transformation of the raw data and utilization of the general linear model
permits least-squares estimates of these parameters of interest, along with
appropriate tests of hypotheses using nothing more esoteric than Student's
t-distribution (Glass, Willson, and Gottman, 1975, pp. 136 ff.); all such
procedures, however, necessarily depend on the specific value of $\theta$
used. Since $\theta$ is itself generally unknown, some procedure must be
used for finding the "appropriate" value.

Three such methods for "choosing" $\theta$ have been suggested. The first
of these selects the value of $\theta$ which minimizes $\sum_{t=1}^{N} a_t^2$
in the general linear model $y = Xb + a$; here, $y$ is a column vector of
transformed data defined by $y_1 = z_1$ and
$y_t = z_t - z_{t-1} + \theta y_{t-1}$; $X$ is the $N \times 2$ "design"
matrix whose $(i,1)$th entry is $\theta^{i-1}$, and whose $(i,2)$th entry
is 0 if $i \le n_1$ and $\theta^{i-n_1-1}$ if $i > n_1$ (here $n_1$ =
number of time points preceding the intervention E, and $N$ = total number
of time points in the series); $b$ is the vector $[L, \delta]'$; and $a$ is
a column vector of random shocks (errors) $a_t$. The quantity
$\sum_{t=1}^{N} a_t^2$ is easily computed as
$(y - X\hat{b})'(y - X\hat{b})$. This method yields the maximum
likelihood estimate of theta. In what follows, we shall refer to this
method as SSE or SSEMIN, for Sum of Squared Errors, MINimized.
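The SSEMIN procedure just described can be sketched as a grid search. The
code below is a minimal illustration, not the authors' program: the grid
from -.99 to .99 in steps of .01 is an assumption (suggested only by the
"topping out" values near .99 reported in this paper), and the function
names are hypothetical. It assumes $0 < n_1 < N$ so that both columns of
$X$ are nonzero.

```python
def transform(z, theta):
    """y_1 = z_1 and y_t = z_t - z_{t-1} + theta * y_{t-1}."""
    y = [z[0]]
    for t in range(1, len(z)):
        y.append(z[t] - z[t - 1] + theta * y[-1])
    return y


def design(theta, n1, n):
    """N x 2 design matrix: column 1 carries L, column 2 carries delta."""
    return [(theta ** (i - 1), theta ** (i - n1 - 1) if i > n1 else 0.0)
            for i in range(1, n + 1)]


def sse(z, theta, n1):
    """Return (residual sum of squares, (L_hat, delta_hat)) for one theta."""
    y = transform(z, theta)
    x = design(theta, n1, len(z))
    # Normal equations for the two-column least-squares problem.
    s11 = sum(c1 * c1 for c1, _ in x)
    s12 = sum(c1 * c2 for c1, c2 in x)
    s22 = sum(c2 * c2 for _, c2 in x)
    t1 = sum(c1 * yi for (c1, _), yi in zip(x, y))
    t2 = sum(c2 * yi for (_, c2), yi in zip(x, y))
    det = s11 * s22 - s12 * s12
    level = (s22 * t1 - s12 * t2) / det
    delta = (s11 * t2 - s12 * t1) / det
    resid = [yi - c1 * level - c2 * delta for (c1, c2), yi in zip(x, y)]
    return sum(r * r for r in resid), (level, delta)


def ssemin(z, n1):
    """Grid value of theta with the smallest residual sum of squares."""
    grid = [k / 100.0 for k in range(-99, 100)]  # assumed grid, step .01
    return min(grid, key=lambda th: sse(z, th, n1)[0])
```

With a bounded grid, an estimate at either end of the grid corresponds to
the "topping out" or "bottoming out" behavior discussed later in the paper.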

The second method is a Bayesian approach: we use the computed value
of $s^2 = (y - X\hat{b})'(y - X\hat{b})/(N - 2)$ to define the function
$h(\theta \mid z) = |X'X|^{-1/2} s^{-(N-2)}$, and choose $\theta$ such that
$h$ is maximized. This method assumes an "uninformed" prior
distribution. Box and Tiao (1965, p. 189) give an explicit formula for $h$
for the case of models (1.2) and (1.3). Hereafter we shall refer to this
procedure as PD or PDMAX, for Posterior Distribution MAXimization.
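As a rough sketch of PDMAX, the code below evaluates the logarithm of a
posterior of the form $|X'X|^{-1/2} s^{-(N-2)}$ on a grid and returns the
maximizing value. Both that functional form (as read from the passage
above) and the grid of -.99 to .99 in steps of .01 are assumptions of this
example, and the function names are hypothetical; it is not the authors'
program. It requires $0 < n_1 < N$ and a nonzero residual sum of squares.

```python
import math


def log_posterior(z, theta, n1):
    """log h(theta | z) = -0.5 * log|X'X| - 0.5 * (N - 2) * log(s^2)."""
    n = len(z)
    # Transformed data: y_1 = z_1, y_t = z_t - z_{t-1} + theta * y_{t-1}.
    y = [z[0]]
    for t in range(1, n):
        y.append(z[t] - z[t - 1] + theta * y[-1])
    # Design matrix columns for L and delta.
    x = [(theta ** (i - 1), theta ** (i - n1 - 1) if i > n1 else 0.0)
         for i in range(1, n + 1)]
    s11 = sum(c1 * c1 for c1, _ in x)
    s12 = sum(c1 * c2 for c1, c2 in x)
    s22 = sum(c2 * c2 for _, c2 in x)
    t1 = sum(c1 * yi for (c1, _), yi in zip(x, y))
    t2 = sum(c2 * yi for (_, c2), yi in zip(x, y))
    det = s11 * s22 - s12 * s12          # determinant of X'X
    level = (s22 * t1 - s12 * t2) / det
    delta = (s11 * t2 - s12 * t1) / det
    ss = sum((yi - c1 * level - c2 * delta) ** 2
             for (c1, c2), yi in zip(x, y))
    s2 = ss / (n - 2)
    return -0.5 * math.log(det) - 0.5 * (n - 2) * math.log(s2)


def pdmax(z, n1):
    """Grid value of theta that maximizes the log posterior."""
    grid = [k / 100.0 for k in range(-99, 100)]  # assumed grid, step .01
    return max(grid, key=lambda th: log_posterior(z, th, n1))
```

Working on the log scale avoids underflow in $s^{-(N-2)}$ and does not
change the location of the maximum.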

The third method merely solves for $\theta$ in the theoretical identity

(1.4)    $\rho_1 = -\theta / (1 + \theta^2)$

(Box and Jenkins, 1970, p. 69), where $\rho_1$ is the lag-1 autocorrelation
(which can easily be estimated from the data). We refer to this method as
CORR.
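Identity (1.4) is a quadratic in theta, $\rho_1 \theta^2 + \theta + \rho_1 = 0$,
so CORR can be sketched as below. The choice of the root with
$|\theta| < 1$, the use of first differences of the series to estimate the
autocorrelation, and the function names are assumptions of this example
rather than details given in the paper.

```python
import math


def lag1_autocorr_of_differences(z):
    """Sample lag-1 autocorrelation of w_t = z_t - z_{t-1} (assumed estimator)."""
    w = [z[t] - z[t - 1] for t in range(1, len(z))]
    mean = sum(w) / len(w)
    num = sum((w[t] - mean) * (w[t + 1] - mean) for t in range(len(w) - 1))
    den = sum((v - mean) ** 2 for v in w)
    return num / den


def theta_from_r1(r1):
    """Solve r1 * theta**2 + theta + r1 = 0 for the root with |theta| <= 1.

    Returns None when |r1| > .5, where both roots are complex.
    """
    if r1 == 0.0:
        return 0.0
    disc = 1.0 - 4.0 * r1 * r1
    if disc < 0.0:
        return None
    return (-1.0 + math.sqrt(disc)) / (2.0 * r1)
```

Because the discriminant is $1 - 4\rho_1^2$, a real solution exists only
when the estimated autocorrelation lies in [-.5, .5].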






2. Objectives

No decision rule exists for "selecting" the "appropriate" value
of theta. In fact, no procedures are available for determining
whether one method should be preferable to the others. Although the
values of theta produced by the three methods are frequently in close
agreement, there are instances in which they may differ widely. Three
examples will illustrate the potential difficulties.

Figures 1, 2, and 3 represent time series generated from random
numbers and preassigned parameter values. In each case, an
IMA (0,1,1) model equivalent to (1.2) and (1.3) was used to generate
the series, with $n_1$ = 30, N = 60, L = 0, $\delta$ = 0, and $\theta$ = .40.
The error terms were NID (0,1). The results are summarized below:

SERIES    SSEMIN    PDMAX     CORR         TRUE
          theta     theta     theta        theta

  1        .77       .56       .25          .40
  2        .99       .99       .45          .40
  3        .99       .31       undefined    .40

Series 1 is distinguished by complete disagreement between the three
methods, with differences on the order of .2. In Series 2, SSEMIN and
PDMAX have "topped out," producing estimates at or near the upper limit
of permissible values of $\theta$; note, however, that CORR has produced a
good estimate of $\theta$. Series 3 displays yet another "pathological"
situation: SSEMIN has topped out, PDMAX appears normal, and CORR has
produced a complex estimate of $\theta$! (The latter circumstance occurs
whenever $|r_1| > .5$, where $r_1$ is the estimated lag-1
autocorrelation.) It should be noted here that these examples were not
contrived; they appeared in the first 100 time series generated during
the testing of the computer programs used in this study.




Figure 1  A Time Series Defined by $z_t - z_{t-1} = a_t - .4a_{t-1}$, for which
SSEMIN $\hat{\theta}$ = .77, PDMAX $\hat{\theta}$ = .56, and CORR $\hat{\theta}$ = .25. (Raw data values are
given below; "?" marks a digit illegible in the source copy.)

   t      z_t         t      z_t         t      z_t         t      z_t

   1    -.50956      16    -.82147      31     .3505?      46    -.299?0
   2    -.3?78?      17    -.24159      32    -.70319      47    -.91187
   3   -1.75570      18     .50792      33    -.03351      48    -.339?4
   4   -1.30331      19   -1.10159      34     .15259      49    -.75124
   5     .39748      20    -.62???      35    -.3330?      50     .36481
   6   -1.05543      21     .171?9      36     .90781      51     .52576
   7    -.?3269      22    -.27972      37     .90359      52     .73059
   8   -2.1?251      23    1.2?653      38    3.39536      53    1.06632
   9   -1.93?4?      24    1.58326      39     .43409      54    -.41633
  10    -.67561      25     .81504      40    2.88648      55     .85683
  11    1.04??7      26    2.63036      41    -.27226      56    -.56898
  12    1.94783      27    1.67359      42    1.29166      57    -.57721
  13     .89106      28    2.???45      43     .94709      58    -.46467
  14   -1.21?84      29     .99347      44    1.25019      59    -.01620
  15     .05537      30     .77547      45    2.20323      60    1.81171


Figure 2  A Time Series Defined by $z_t - z_{t-1} = a_t - .4a_{t-1}$, for which
SSEMIN $\hat{\theta}$ = .99, PDMAX $\hat{\theta}$ = .99, and CORR $\hat{\theta}$ = .45. (Raw data values are
given below; "?" marks a digit illegible in the source copy.)

   t      z_t         t      z_t         t      z_t         t      z_t

   1   -1.20872      16   -2.65491      31    -.25791      46    -.53320
   2   -2.61541      17   -?.55??9      32    -.54??0      47   -1.08796
   3   -1.76947      18   -2.92075      33     .212?9      48   -1.94553
   4   -2.26526      19   -1.091?6      34    -.74336      49   -1.97188
   5   -2.74066      20    -.05?91      35    -.33768      50   -2.54917
   6   -1.69193      21   -3.96347      36    -.2?049      51   -1.60747
   7   -1.90799      22   -2.56271      37    -.?9230      52   -1.56289
   8   -3.29326      23    -.89612      38   -1.73034      53   -2.27819
   9   -2.25422      24   -1.43146      39   -2.74358      54   -2.9943?
  10   -1.6127?      25   -1.57890      40     .43743      55   -2.89742
  11   -2.34021      26   -1.01972      41   -1.81990      56   -?.45638
  12   -3.37741      27   -1.20197      42    -.72163      57   -1.501?2
  13   -1.19264      28   -1.25735      43    -.63091      58   -2.53836
  14   -2.82437      29     .16511      44   -1.52007      59   -2.06113
  15   -3.27598      30     .14939      45     .72893      60   -3.33544


Figure 3  A Time Series Defined by $z_t - z_{t-1} = a_t - .4a_{t-1}$, for which
SSEMIN $\hat{\theta}$ = .99, PDMAX $\hat{\theta}$ = .31, and CORR $\hat{\theta}$ is undefined. (Raw data values
are given below; "?" marks a digit illegible in the source copy.)

   t      z_t         t      z_t         t      z_t         t      z_t

   1    2.54108      16    5.04931      31    6.97900      46    2.64907
   2    2.?????      17    3.66878      32    5.47227      47     .99326
   3    2.?29?0      18    5.55951      33    5.75792      48    2.72786
   4    3.18150      19    5.71685      34    5.677?3      49    3.19561
   5    4.01591      20    5.??405      35    5.84184      50    3.45374
   6    4.32939      21    6.06740      36    6.73658      51    2.6?422
   7    ?.07710      22    6.56521      37    5.32152      52    3.755?5
   8    3.68292      23    ?.917?7      38    6.16077      53    4.28520
   9    3.07?91      24    8.31332      39    4.65867      54    4.63624
  10    4.14799      25    6.34903      40    5.23080      55    3.8317?
  11    4.54090      26    7.45182      41    4.96766      56    3.03398
  12    4.63839      27    6.7?753      42    2.20366      57    3.62624
  13    4.685?2      28    8.71404      43    2.61??0      58    5.14678
  14    4.447??      29    7.36904      44    2.4994?      59    5.40953
  15    4.08170      30    6.9856?      45    1.70197      60    4.58756




Thus, we ask the following questions:

(1) How accurately do the three methods estimate theta?

(2) To what extent does each method's accuracy depend on the true
value of theta?

(3) To what extent does the value of another parameter in the model
(namely, a change in series level: $\delta$) influence the accuracy of
each method?

3. Method

"Monte Carlo" simulation techniques were deemed appropriate, and
were utilized on the University of Minnesota's Control Data Cyber 74
computer.

Twenty populations of time series of the form shown in (1.2) and
(1.3) were defined: ten for which theta was given a value of .99, .9,
.7, .5, .3, .1, 0, -.3, -.5, and -.99, respectively, and delta was zero,
and ten more with the same values of theta, and delta = .5. (More
positive values than negative were used for theta because theta is
nearly always positive in the real world.) For each of these 20
populations, 1000 sample series were generated; each of these series had
$n_1$ = 30, N = 60, L = 0, and used random shocks that were normal,
independent, with mean 0 and variance 1. For each of the 20,000 sample
series thus defined, theta was estimated from the data by the methods
SSEMIN, PDMAX, and CORR; these numbers, plus the lag-1 autocorrelation
(referred to hereafter as LAG), were retained, and descriptive
statistics computed.

For each preassigned value of theta, a Smirnov two-sample
goodness-of-fit test was performed, comparing the distributions for
which $\delta$ = 0 with those for which $\delta$ = .5 (Conover, 1971,
pp. 309-314).
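The two-sample Smirnov statistic used for this comparison is the largest
vertical distance between the two empirical distribution functions. The
sketch below is an illustration only: the function names are
hypothetical, and the coefficients 1.36 and 1.63 are the usual
large-sample approximations for two-tailed tests at alpha = .05 and .01,
not values quoted in this paper.

```python
import bisect
import math


def smirnov_statistic(x, y):
    """Largest vertical gap between the empirical CDFs of samples x and y."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        # Proportion of each sample that is <= v.
        fx = bisect.bisect_right(xs, v) / len(xs)
        fy = bisect.bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def critical_value(n, m, coeff=1.36):
    """Large-sample two-tailed critical value (1.36 for .05, 1.63 for .01)."""
    return coeff * math.sqrt((n + m) / (n * m))


if __name__ == "__main__":
    # With two samples of 1000 estimates each, as in this study:
    print(round(critical_value(1000, 1000), 3))        # alpha = .05
    print(round(critical_value(1000, 1000, 1.63), 3))  # alpha = .01
```

A statistic larger than the critical value indicates that the two sets of
estimates are unlikely to come from the same distribution.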




4. Results

Descriptive statistics produced by the 20 computer runs are
displayed in Tables 1-5.

Table 1 shows that SSEMIN and PDMAX are comparably accurate over
all values of $\theta$ tested; the means are within .025 of the true
values of $\theta$, except near the extremes, where differences of .09 or
so can occur. The medians of SSEMIN and PDMAX are similarly accurate,
and are generally better estimates near theta's extreme values. The
modes reflect the topping-out or bottoming-out effect noted previously.

Table 2 shows all three methods to be of surprisingly consistent
accuracy, in the sense that the distributions of $\hat{\theta}$ all have
standard errors on the order of .01, independent of either $\theta$ or
$\delta$.

Table 3 reveals (as one might expect) that as the true value of
$\theta$ deviates from 0 (the midpoint of its possible range of values)
the distributions of estimates of $\theta$ provided by SSEMIN and PDMAX
become less and less symmetric.

The evidence for CORR is somewhat less encouraging; although it is
substantially easier to compute in practice than either SSEMIN or
PDMAX, we see from Tables 1-3 that the behavior of its estimates is
much less desirable than that of the other methods. Its mean
$\hat{\theta}$ appears to be tolerably accurate only in the range 0 to .6
or so (albeit the most common real-life range for $\theta$), though less
so than the other methods. It is both "quicker" and "dirtier" than its
companions.

CORR does not show a tendency toward skewness at extreme values of
true theta; this lack of "sensitivity," as well as part of the method's
general inaccuracy, can be attributed to the fact that a large portion
of the distributions tested had lag-1 autocorrelations (LAG here) that



Table 1: Measures of Central Tendency Computed for Various Chosen Values of
         Theta and Delta; Tabled Values are Estimates of Theta, Based on
         1000 Computer-Generated Time Series. ("?" marks a digit illegible
         in the source copy; "--" marks an entry that is missing or
         unrecoverable in the source copy.)

 TRUE   TRUE         MEAN                  MEDIAN                 MODE
 THETA  DELTA    SSE    PD    CORR     SSE    PD    CORR     SSE    PD    CORR

 -.99    .0    -.952  -.950  -.?80   -.989  -.98?    --    -.990  -.990  -.?20
 -.99    .5    -.92?  -.90?  -.?81   -.957  -.911  -.?75   -.990  -.990  -.440
 -.5     .0    -.507  -.513  -.375   -.516  -.?99  -.366   -.990  -.990    --
 -.5     .5    -.517  -.525  -.385   -.526  -.511  -.371   -.990  -.990  -.370
 -.3     .0    -.320  -.31?    --    -.321  -.311  -.223   -.990  -.320  -.120
 -.3     .5    -.31?  -.309  -.232   -.308  -.297  -.217   -.990  -.990  -.120
  .0     .0    -.00?  -.009   .0??   -.002  -.002    --    -.030  -.030   .040
  .0     .5     .007   .00?   .051    .005   .00?   .0?4    .050   .000   .020
  .1     .0     .109   .115   .1?7    .118   .116   .135    .990   .120   .220
  .1     .5     .095   .098   .130    .096   .095   .123    .990   .020   .170
  .3     .0     .305   .305   .308    .302   .291   .299    .990   .260   .200
  .3     .5     .317   .313   .302    .312   .300   .290    .990   .290   .250
  .5     .0     .510   .51?   .???    .523   .505    --     .990   .990   .510
  .5     .5     .52?   .521   .?26    .521   .506   .?15    .990   .990   .410
  .7     .0     .71?   .717   .?87    .7?5   .712   .?71    .990   .990   .460
  .7     .5     .716   .703   .??6    .730   .701   .???    .990   .99?    --
  .9     .0     .377   .890   .521    .963   .905   .516    .990   .990   .490
  .9     .5     .831   .873   .519    .930   .882   .515    .990   .990   .510
  .99    .0     .926   .9?5   .503    .939   .935   .?99    .990   .990   .610
  .99    .5     .902   .893   .529    .960   .912   .530    .990   .990   .520




Table 2: Measures of Variability Computed for Various Chosen Values of
         Theta and Delta; Tabled Values Refer to Estimates of Theta,
         Based on 1000 Computer-Generated Time Series. ("?" marks a digit
         illegible in the source copy; "--" marks an entry that is missing
         or unrecoverable in the source copy.)

 TRUE   TRUE      STD. ERROR            STD. DEV.             VARIANCE
 THETA  DELTA    SSE    PD    CORR     SSE    PD    CORR     SSE    PD    CORR

 -.99    .0     .008   .00?   .008    .256   .117   .193    .066   .01?   .039
 -.99    .5     .007   .003   .007    .22?   .083   .186    .050   .007   .035
 -.5     .0     .010   .008   .007    .329   .2?3   .200    .108   .059   .0?0
 -.5     .5     .010   .007   .007    .320   .211   .202    .102   .0?5   .0??
 -.3     .0     .009   .007   .007    .237   .237   .20?    .083   .056   .0?2
 -.3     .5     .009   .008   .007    .29?   .2?3   .213    .087   .059   .045
  .0     .0     .009   .007   .006    .295   .235   .201    .087   .055   .040
  .0     .5     .009   .007   .007    .276   .212   .212    .076   .0?5   .045
  .1     .0     .010   .008   .007    .30?    --    .210    .092   .060   .044
  .1     .5     .009   .007   .007    .300   .233   .200    .090   .05?   .043
  .3     .0     .009   .007   .007    .292   .232   .207    .035   .0??   .043
  .3     .5     .010   .003   .007    .303   .250   .203    .092   .063   .043
  .5     .0     .010   .007   .007    .311   .209   .193    .096    --    .037
  .5     .5     .009   .007   .007    .293   .231   .200    .039   .053   .040
  .7     .0     .011   .007   .003    .?35   .213   .185    .113   .045   .034
  .7     .5     .010   .007   .008    .305   .210   .185    .093    --    .034
  .9     .0     .011   .005   .003    .3?6   .1??   .186    .119   .027   .035
  .9     .5     .009   .005   .008    .273   .168   .188    .077   .028   .035
  .99    .0     .011   .005   .003    .333   .1??   .183    .11?   .022   .033
  .99    .5     .010   .006   .008    .312   .188   .185    .097   .035   .034




Table 3: Skew and Kurtosis Computed for Various Chosen Values of Theta
         and Delta; Tabled Values Refer to Estimates of Theta, Based
         on 1000 Computer-Generated Time Series. ("?" marks a digit
         illegible in the source copy; "--" marks an entry that is missing
         or unrecoverable in the source copy.)

 TRUE   TRUE            SKEW                         KURTOSIS
 THETA  DELTA     SSE      PD      CORR        SSE       PD       CORR

 -.99    .0        --     ?.6?7   -.122         --     219.007   -.161
 -.99    .5      ?.0??   12.3??     --          --        --     -.?39
 -.5     .0        --     ?.901   -.275         --        --       --
 -.5     .5        --     1.589    .???       11.???    1?.923   -.160
 -.3     .0        --     1.100     --          --        --       --
 -.3     .5      1.338     .897    .3?2        8.079     9.8?5    .357
  .0     .0      -.1??      --     .190        ?.190     9.???   1.??0
  .0     .5       .021    -.14?     --         6.523     9.615    .623
  .1     .0        --       --     .293         --       ?.0??     --
  .1     .5      -.?02     .00?   -.0??        5.751     8.??3    .361
  .3     .0        --     -.73?    .3?3        8.??7    10.133     --
  .3     .5        --    -1.0?2    .?15        8.022    10.172   -.070
  .5     .0        --    -1.955     --        12.837    17.527   -.195
  .5     .5     -2.669   -1.970    .2?2       12.972    15.1?9   -.193
  .7     .0     -3.9??      --     .166       17.795    35.009   -.2?7
  .7     .5     -?.07?   -?.009    .039       20.560    31.?0?   -.590
  .9     .0        --    -?.322    .056       2?.3?2    9?.0?8   -.?32
  .9     .5     -6.127   -8.276    .015       28.?36    89.053   -.29?
  .99    .0        --   -11.311    .097         --     1??.187   -.?5
  .99    .5     -5.73?   -9.072   -.135       32.093    87.367   -.272




fell out of range (see Table 5). Without this truncation, the LAG
estimates provided good estimates of the true lag-1 autocorrelation
(which can then be transformed to theta via (1.4)). Summary statistics
of these distributions of nontruncated LAG estimates appear in Table 4.

(Table 5 also displays percentages of the samples tested for which
SSEMIN and/or PDMAX topped out or bottomed out. This gives us a rough
idea of the expected frequency of these situations.)

Finally, we note from Table 6 that most of the distributions
generated by SSEMIN, PDMAX, and LAG showed a theoretical dependence on
the value of $\delta$, whereas those distributions generated by CORR
showed little dependence on $\delta$. The test statistic being evaluated
is the greatest vertical distance between the empirical cumulative
distribution functions of the two sample distributions under scrutiny
(Conover, 1971, p. 310).

5. Conclusions

SSEMIN and PDMAX appear to estimate theta adequately in all ranges
of true theta. CORR is less accurate, especially outside the range .0
to .6, although the lag-1 autocorrelations (LAG) of samples are good
estimators of the true autocorrelation $\rho_1$. Practical problems in
using each method include the very real possibility that an estimator
will "top out" or "bottom out," or, in the case of CORR, not exist.



Table 4: Summary Statistics Computed for Various Chosen Values of Theta
         and Delta; Tabled Values Refer to Estimates of the Lag-1
         Autocorrelation, Based on 1000 Computer-Generated Time Series.
         ("?" marks a digit illegible in the source copy; "--" marks an
         entry that is missing or unrecoverable in the source copy.)

 TRUE THETA /                                     VARIABILITY         HIGHER MOMENTS
 TRUE LAG-1    TRUE                             STD.   STD.
 CORRELATION   DELTA   MEAN   MEDIAN  MODE     ERROR   DEV.   VARIANCE   SKEW  KURTOSIS

 -.99 /  .499   .0      --     .???   .510      .004   .136    .018     -.336    .246
 -.99 /  .499   .5      --     .?57   .370      .004   .137    .019     -.328   -.134
 -.5  /  .400   .0     .3?2     --    .360      .005    --     .023     -.294   -.006
 -.5  /  .400   .5     .351    .360   .430      .005   .151    .023     -.378    .011
 -.3  /  .275   .0     .216    .221   .190      .005   .165    .027     -.182   -.000
 -.3  /  .275   .5     .207    .21?   .260      .005   .171    .029     -.230   -.258
  .0  /  .0     .0    -.030   -.033  -.040      .006   .179    .032      .069   -.130
  .0  /  .0     .5    -.0?3   -.044  -.080      .006   .190    .036      .073   -.010
  .1  / -.099   .0    -.132   -.135  -.210      .006   .179    .032      .169   -.319
  .1  / -.099   .5    -.120   -.123  -.170      .006   .182    .033      .234   -.016
  .3  / -.275   .0    -.279   -.292  -.3?0      .005   .162    .026      .274    .117
  .3  / -.275   .5    -.280   -.291  -.250      .005   .170    .029      .318   -.043
  .5  / -.400   .0    -.399     --   -.390      .005   .146    .021      .267   -.007
  .5  / -.400   .5    -.392   -.400  -.410      .005   .145    .021      .347    .002
  .7  / -.470   .0    -.???   -.466  -.550      .004   .136    .018      .314    .030
  .7  / -.470   .5    -.?55   -.461  -.450      .004   .134    .018      .301   -.211
  .9  / -.497   .0    -.???   -.4?4  -.480      .004   .131    .017      .260    .175
  .9  / -.497   .5    -.???   -.???  -.450      .004   .130    .017      .410    .266
  .99 / -.499   .0      --    -.490  -.560      .004   .332    .017      .307   -.130
  .99 / -.499   .5    -.???   -.500  -.520      .004   .127    .016      .495    .491




Table 5: Percentage of 1000 Computer-Generated Time Series Judged
         "Out of Range." For SSE and PD, BOT = percentage of series with
         $\hat{\theta} \le -.99$, and TOP = percentage of series with
         $\hat{\theta} \ge .99$; for LAG, BOT = percentage of series with
         $r_1 \le -.5$, and TOP = percentage of series with $r_1 \ge .5$.
         ("?" marks a digit illegible in the source copy; "--" marks an
         entry that is missing or unrecoverable in the source copy.)

 TRUE THETA /
 TRUE LAG-1    TRUE          SSE                  PD                  LAG
 CORRELATION   DELTA    BOT   MID   TOP      BOT   MID   TOP      BOT   MID   TOP

 -.99 /  .499   .0     85.7  12.6   1.7     49.6  50.2   0.2      0.0  68.3  31.7
 -.99 /  .499   .5     32.8  65.9   1.3     19.9  80.0   0.1      0.0  ??.?  ??.?
 -.5  /  .400   .0      9.7  87.1   3.2      7.5  91.5   1.0      0.0  84.0  16.0
 -.5  /  .400   .5      9.1  88.0   2.9       --    --    --      0.0  84.2  15.8
 -.3  /  .275   .0      5.6  92.1   2.3      3.3  95.0   1.1      0.0  96.3   3.7
 -.3  /  .275   .5      5.9  91.7   2.4      3.9  95.3   0.8      0.0  97.3   2.1
  .0  /  .0     .0      3.7  93.0   3.3      1.5  97.4   1.1      0.2  99.5   0.3
  .0  /  .0     .5      2.7  94.4   2.9      1.0  97.9   1.1      0.?  99.3   0.?
  .1  / -.099   .0      3.6  92.4   4.0      1.2  96.2   2.6      1.3  93.7   0.?
  .1  / -.099   .5      3.1  92.9   4.0      0.9  97.4   1.7      1.3  9?.7   0.?
  .3  / -.275   .0      2.5  92.1   5.4      0.6  95.2   3.5      8.5  91.5   0.0
  .3  / -.275   .5      2.7  91.0   6.3      1.1  94.9   4.0      9.9  90.1   0.0
  .5  / -.400   .0      2.8  89.0   8.2      0.4  94.3   5.3     26.9  73.1   0.0
  .5  / -.400   .5      2.3  88.6   9.1      0.8  92.7   6.5     26.0  74.0   0.0
  .7  / -.470   .0      3.1  79.7  17.2      0.8  86.6  12.6     42.5  57.5   0.0
  .7  / -.470   .5      2.4  80.1  17.5      0.6  86.1  13.3     41.0  59.0   0.0
  .9  / -.497   .0      3.2  53.5  43.3      0.6  71.6  27.8     47.0  53.0   0.0
  .9  / -.497   .5      2.0  61.8  36.2      0.6  75.5  23.9     47.0  53.0   0.0
  .99 / -.499   .0      3.0  10.6  86.4      0.4  48.7  50.9     48.6  51.4   0.0
  .99 / -.499   .5      2.6  64.0  33.4      0.8  87.1  12.1     51.9  48.1   0.0




Table 6: Smirnov Two-Sample Test Statistics, Comparing $\hat{\theta}$
         Distributions with $\delta$ = 0 to those with $\delta$ = .5.
         * = significant at alpha = .05, ** = significant at
         alpha = .01; all tests are 2-tailed.

 TRUE
 THETA     SSEMIN     PDMAX      CORR       LAG

 -.99      .895**     .536**     .057       .100**
 -.5       .098**     .080**     .054       .068*
 -.3       .064*      .067*      .046       .050
  .0       .072*      .078**     .053       .057
  .1       .103**     .105**     .071       .071*
  .3       .065*      .068*      .048       .056
  .5       .091**     .076**     .049       .051
  .7       .175**     .133**     .051       .062*
  .9       .433**     .278**     .042       .043
  .99      .864**     .525**     .095*      .075**




Each estimation method is consistently accurate, in the sense that
if the specific estimate $\hat{\theta}$ is thought of as a sample chosen
from a theoretical distribution of $\hat{\theta}$, then the standard error
of the estimate is likely to be less than .01.

Although the presence of a change in level has little practical
impact on the estimated value, other investigation reveals (Table 6)
that the value of $\delta$ does change the nature of the theoretical
distribution of estimates of theta.



References

(1) Glass, Gene V., Willson, Victor L., and Gottman, John M.
    Design and Analysis of Time Series Experiments. Boulder:
    Colorado Associated University Press, 1975.

(2) Box, G.E.P., and Tiao, G.C. "A Change in Level of a Non-
    Stationary Time Series." Biometrika, 1965, 52: 181-192.

(3) Box, G.E.P., and Jenkins, G.M. Time Series Analysis: Forecasting
    and Control. San Francisco: Holden-Day, 1970.

(4) Conover, W.J. Practical Nonparametric Statistics. New York:
    Wiley, 1971.



